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Abstract: 

We develop a new elementary move for simulations of polymer chains in torsion angle 
space. The method is flexible and easy to implement. Tentative updates are drawn 
from a (conformation- dependent) Gaussian distribution that favors approximately 
local deformations of the chain. The degree of bias is controlled by a parameter 
b. The method is tested on a reduced model protein with 54 amino acids and the 
Ramachandran torsion angles as its only degrees of freedom, for different b. Without 
excessive fine tuning, we find that the effective step size can be increased by a factor 
of three compared to the unbiased 6 = case. The method may be useful for kinetic 
studies, too. 
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1 Introduction 



Kinetic simulations of protein folding are notoriously difficult. Thermodynamic sim- 
ulations may use unphysical moves and are therefore potentially easier, but existing 
methods need improvement. Three properties that a successful thermodynamic al- 
gorithm must possess are as follows. First and foremost, it must be able to alleviate 
the multiple- minima problem. Methods like the multicanonical algorithm and 
simulated tempering try to do so by the use of generalized ensembles. Second, 
it must provide an efficient evolution of large-scale properties of unfolded chains. The 
simple pivot method does remarkably well in that respect. Third, it must be 
able to alter local properties of folded chains without causing too drastic changes in 
their global structure. This paper is concerned with the third problem, which is im- 
portant if the backbone potentials are stiff and especially if the mobility is restricted 
to the biologically most relevant torsional degrees of freedom. 



An update that rearranges a restricted section of the chain without affecting the 
remainder is local. For chains with flexible or semifiexible backbones, there exists 
a variety of local updates, ranging from simple single-site moves to more elaborate 
methods H^12| where inner sections are removed and then regrown site by site in 
a configurational-bias manner |TB|,|T^. However, these methods break down if bond 
lengths and bond angles are completely rigid. 



The problem of generating local deformations of chains with only torsional degrees of 
freedom was analyzed in a classic paper by Go and Scheraga [l^ . Based on this anal- 
ysis, Dodd et al. |T^ devised the first proper Monte Carlo algorithm of this type, the 
concerted-rotation method. This method works with seven adjacent torsion angles 
along the chain. One of these angles is turned by a random amount. Possible values 
of the remaining six angles are then determined by numerically solving a set of equa- 
tions that guarantee that the move is local. The new conformation is finally drawn 
from the set of all possible solutions to this so-called rebridging problem. Variations 
and generalizations of this method have been discussed by several groups [|T^-|TP[. 
There are also methods p0|-|2^ that combine elements of the configurational-bias 
and concerted-rotation approaches. One of these methods uses an analytical re- 



bridging scheme, inspired by the solution for a similar problem in robotic control |25 



The concerted-rotation approach is a powerful method that can generate large local 
deformations by finding the discrete solutions to the rebridging problem. However, 
the method is not easy to implement and large local deformations may be difficult 
to accomplish if, for example, the chain is folded and has bulky side groups. Hence, 
there are situations where this method is not the obvious choice. 
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In this paper, we discuss a different and less sophisticated type of Monte Carlo move 
in torsion angle space. This algorithm is by nature a "small-step" algorithm so large 
local deformations cannot take place. Drastic global changes would still occur if the 
steps were random. To avoid that, a biasing probability is introduced. The method 
becomes approximately local if the bias is made strong. Compared to a strictly local 
update, this method has the disadvantage that a much smaller part of the energy 
function is left unchanged, so the CPU time per update is larger. However, this 
problem is not too severe for moderate chain lengths. Moreover, both our method 
and strictly local ones are typically combined with some truly nonlocal update like 
pivot, and such an update is not faster than ours. 

The algorithm proceeds as follows. We consider n torsion angles 0i, where n = 8 in 
our calculations. To update these angles, we introduce a conformation-dependent n x 
n matrix G such that dcp^Gdcj) ~ for changes 50 = (50i, . . . , 50n) that correspond 
to local deformations. The steps ^"^^ then drawn from the Gaussian distribution 



P{S(j)) oc exp 



■^6^^{1 + bG)5^ 



where 1 denotes the n x n unit matrix and a and b are tunable parameters. The 
parameter a controls the acceptance rate, whereas b sets the degree of bias. The new 
conformation is finally subject to an accept/reject step. Important to the implemen- 
tation of the algorithm is that the matrix G is non-negative and symmetric. Hence, 
it is possible to take the "square root" of 1 + bG, which facilitates the calculations. 



This method, which is quite general, is tested on a reduced model protein with 54 
amino acids and the Ramachandran torsion angles as its only degrees of freedom. This 
chain forms a three-helix bundle in its native state and exhibits an abrupt collapse 
transition that coincides with its folding transition. The performance of the method 
is studied both above and below the folding temperature, for different values of the 
parameters a and b. For a suitable choice of b, we find that the effective step size 
can be increased by a factor of three in the folded phase, compared to the unbiased 
6 = case. The optimal value of b corresponds to a relatively strong bias, that is an 
approximately local update. 



2 The Model 



In our calculations, we consider a reduced protein model [26| where each amino acid 



is represented by five or six atoms. The three backbone atoms N, Cq and C are all 
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included, whereas the side chain is represented by a single atom, C^g. The C/j atom 
can be hydrophobic, polar or absent, which means that there are three different types 
of amino acids in the model. For a schematic illustration of the chain representation, 
see Fig. |I]. 

All bond lengths, bond angles and peptide torsion angles (180°) are held fixed, which 
leaves us with two degrees of freedom per amino acid, the Ramachandran torsion 
angles (see Fig. 0). 

The energy function 

E = E\oc + -Esa + -E'hb + Eaa (2) 

is composed of four terms. The local potential ii^ioc has a standard form with threefold 
symmetry, 

^ioc = ^E(l + cos3</>,). (3) 

i 

The self-avoidance term E^^ is given by a hard-sphere potential of the form 

i?sa = 6.. E' (4) 

where the sum runs over all possible atom pairs except those consisting of two hy- 
drophobic Cp. The hydrogen-bond term _Ehb is given by 

-Ehb = Chb E u{rij)v{aij, Pij) , (5) 
where i and j represent H and O atoms (see Fig. respectively, and 

v^a,j,p,,) - I otherwise 

In these equations, denotes the HO distance, the NHO angle, and [3ij the HOC 
angle. Finally, the hydrophobicity term Ep^x has the form 



-E'aa = caa E 



where both i and j represent hydrophobic C/3. In the following, kT is given in 
dimensionless units, in which ehb = 2.8 and caa = 2.2. Further details of the model. 



including numerical values of all the parameters, can be found in Ref. ||26 
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Figure 1: Update of the chain defined in Sec. |^. Eight torsion angles 0i are turned. 
Turns such that the three atoms in the box are left unaffected are favored. Thick 
lines represent peptide bonds and the peptide torsion angles are fixed. 



In this model, we study a designed three-helix-bundle protein with 54 amino acids. 
In Ref. [|^, it was demonstrated that this sequence indeed forms a stable three-helix 
bundle, except for a twofold topological degeneracy, and that it has a first-order- 
like folding transition that coincides with the collapse transition. It should be noted 
that these properties are found without resorting to the widely used but drastic Go 
approximation |27], where interactions that do not favor the desired structure are 
ignored. 



3 The Algorithm 

We now turn to the algorithm, which we describe assuming the particular chain 
geometry defined in Sec. 0. That this scheme can be easily generalized to other types 
of chains will be evident. 

Consider a segment of four adjacent amino acids k, k + 1, k + 2 and k + 3 along 
the chain, and let the corresponding eight Ramachandran angles (see Fig. |1|) form a 
vector (j) = (01, . . . , (pn), where n = 8. A change 50 of (p will, by construction, leave 
all amino acids k' < k, as well as the N, H and Ca atoms of amino acid k, unaffected. 
For all amino acids A;' > A; + 3 to remain unaffected too, it is sufficient to require that 
the three atoms Cq,, C and O of amino acid k + 3 (see Fig. |I]) do not move. If this 
condition is fulfilled, the deformation of the chain is local. 

Denote the position vectors of the Ca, C and O atoms of amino acid + 3 by fj, 
1 = 1,2,3. A bias toward local deformations can be obtained by favoring changes 50 
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that correspond to small values of the quantity 



A^ = E(5f/f, (9) 

7=1 

which for small 5(j)i can be written as 

n 

^ 50^G(50 = HiG^j5<pj , (10) 

where 

G«=i:|J#. (11) 

^1 d(pi d(j)j 

Note that the three vectors fj can be described in terms of six independent parame- 
ters, since bond lengths and angles are fixed. This implies that the n x n matrix G, 
which by construction is non-negative and symmetric, has eigenvectors with eigen- 
value zero for n = 8 > 6. A bias toward small means that these soft modes are 
favored. 

We can now define the update, which consists of the following two steps. 



1. Draw a tentative new (/>, d)' from the Gaussian distribution 



00 = ^ ^ exp -(0' - 0)^A(0' - 0) , (12) 



where the matrix 



A=^(1 + 6G) (13) 

is a linear combination of the n x n unit matrix 1 and the matrix G defined 
by Eq. jlTJ The shape of this distribution depends on the parameters a > 
and b > 0. The parameter b sets the degree of bias toward small A^. The bias 
is strong for large b and disappears in the limit 6-^0. The parameter a is 
a direction-independent scale factor that is needed to control the acceptance 
rate. Larger a means higher acceptance rate, for fixed b. If 6 = 0, then the 
components 50j are independent Gaussian random numbers with zero mean 
and variance a~^. Note that W{(f) 0') ^ W{(f)' 0) since the matrix G is 
conformation dependent. 

Accept /reject 0' with probability 

^acc = min (^1, ^11^ exp[-{E' - E)/kT]\ (14) 
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for acceptance. The factor W{(f)' (t>)/W{(t> (f)') is needed for detailed 
balance to be fulfilled, since W is asymmetric. 



It should be stressed that this scheme is quite flexible. For example, it can be 
immediately applied to chains with nonplanar peptide torsion angles. The use of 
the concerted-rotation method for simulations of such chains has recently been dis- 



cussed [28 



A convenient and efficient implementation of the algorithm can be obtained if one 
takes the "square root" of the matrix A, which can be done because A is symmetric 
and positive definite. More precisely, it is possible to find a lower triangular matrix 
L (with nonzero elements only on the diagonal and below) such that 

A = LL^. (15) 

An efficient routine for this so-called Cholesky decomposition can be found in ||2^ . 



3.1 Implementing step 1 



Given the Cholesky decomposition of the matrix A, the first step of the algorithm 
can be implemented as follows. 



Draw a. ip = {ipi, . . . ,ipn) from the distribution P{ip) oc exp(— ■j/'"^?/'). The com- 
ponents ipi are independent Gaussian random numbers and can be generated, 
for example, by using the Box-MuUer method 

V'i = (-lni?i)^/2cos27ri?2, (16) 

where Ri and R2 are uniformly distributed random numbers between and 1. 

Given ip, solve the triangular system of equations 

1/54) = i' (17) 

for 50. It can be readily verified that the 5(j) = cp' — cj) obtained this way has 
the desired distribution Eq. |12|. 
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3.2 Implementing step 2 



The Cholesky decomposition is also useful when calculating the acceptance probabil- 
ity in the second step of the algorithm. The factor W{(j) — > 0') can be easily computed 
by using that 

n 

(detA)i/2^n^« (18) 

i=l 

and that exp[— (0' — 0)"^A(0' — 0)] = exp{—ijj'^ip). The reverse probability W^cj)' 4>) 
depends on A(0') and can be obtained in a similar way, if one makes a Cholesky 
decomposition of that matrix, too. 



3.3 Pivot update 



Previous simulations |]26[ of the model protein defined in Sec. g were carried out by 
using simulated tempering with pivot moves as the elementary conformation update. 
With this algorithm, the system was successfully studied down to temperatures just 
below the folding transition. However, the performance of the pivot update, where a 
single angle (pi is turned, deteriorates in the folded phase. What we hope is that the 
exploration of this phase can be made more efficient by alternating the pivot moves 
with moves of the type described previously. 



4 Results 



The character of the proposed update depends strongly on the bias parameter b. The 
suggested steps have a random direction if 6 = 0. The distribution W{(f) ^ 4>') in 
Eq. |T2| is, by contrast, highly asymmetric in the limit b ^ oo, with nonzero width 
only in directions corresponding to eigenvalue zero of the matrix G. In particular, 
this implies that the reverse probability W{(j)' (p) in the acceptance criterion Eq. n 
tends to be small for large b. 



For the acceptance rate to be reasonable, it is necessary to use a very small step 
size if b is small or large. The question is whether the step size can be increased 
by a better choice of b. To find that out, we performed a set of simulations of 
the three-helix-bundle protein defined in Sec. |] for different a and b. Two different 
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Figure 2: Average step size, {S), against average acceptance rate, (Pace), for different 
updates at (a) kT = 0.7 and (b) kT = 0.6. Shown are results for tlie b = b^i^^ (full 
lines), 6 = (dashed lines) and pivot (dotted lines) updates. 



temperatures were studied, kT = 0.6 and 0.7, one on either side of the folding 
temperature kTf ^ 0.66 p6[| . 



In these runs, we monitored the step size 5, where 



s = M 



.1=1 



1/2 



(19) 



for accepted moves and 5 = for rejected ones. Measurements were taken only when 
the n = 8 angles all were in the segment that makes the middle helix of the three. 
We focus on this segment because it is the most demanding part to update. 



The average step size, (S), depends strongly on b. A rough optimization of b was 
carried out by maximizing (S) as a function of a for different fixed b = 10^ {k integer). 
The best values found were &max = 10 (rad/A)2 and b^^^ = 0.1 (rad/A)^ at kT = 0.6 
and kT = 0.7, respectively. Note that the preferred degree of bias is higher in the 
folded phase. 

In Fig. ^ we show {S) against the average acceptance rate, (Pace), for 6 = and 
b = &max at the two temperatures; (Pace) is an increasing function of a for fixed b and 
T. Also shown are the corresponding results for the pivot update, where only one 
angle 0j is turned (5* = if the change is accepted). At the higher temperature, we 
find that the b = b^s.x and 6 = updates show similar behaviors. The pivot update is 
somewhat better and has its maximum (S) at low (Pace), where the proposed change 
6(j)i is drawn from the uniform distribution between and 2tt. This is consistent with 
the finding that the pivot update is a very efficient method for self-avoiding walks, 
in spite of a low acceptance rate. The situation is different at the lower temperature, 
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Figure 3: Distributions of (see Eq. ^) for the b = fomax (thick hne) and 6 = 
(thin hne) updates at kT = 0.6. The values used for the parameter a correspond to 
maximum (S). 



which is much harder to simulate. Here, the b = 6max update is the best. The 
maximum (S) is approximately three times higher for this method than for the other 
two. This shows that the biasing probability Eq. |T2| is indeed useful in the folded 
phase. 



The 6 = update can be compared with the moves used by Shimada et al. in a 
recent all-atom study of kinetics and thermodynamics for the protein crambin with 
46 amino acids. These authors updated sets of two, four or six backbone torsion 
angles, using independent Gaussian steps with a standard deviation of 2°. Our 6 = 
update has maximum {S) at a ~ 6400 (rad)~^ for kT = 0.6, which corresponds to 
a standard deviation of 0.7°. This value is in line with that used by Shimada et al, 
since we turn eight angles. 



How local is the method for 6 = 6max? To get an idea of that, we calculated the 
distribution of (see Eq. ||) for accepted moves, for 6 = 6inax and 6 = at kT = 0.6. 
As was previously the case, we restricted ourselves to angles in the middle helix. 
The two distributions are shown in Fig. ^ and we see that the one corresponding to 
6 = 6max is sharply peaked near A^ = 0. This shows that the 6 = 6max update is much 
more local than the unbiased 6 = update, although the average step size, {S), is 
considerably larger for 6 = 6niax- 

So far, we have discussed static (one-step) properties of the updates. We also esti- 
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Figure 4: The autocorrelation function C{t) (see the text) at kT = 0.6 for the b = 6max 
(□), 6 = (o) and pivot (O) updates. Step-size parameters correspond to maximum 
{S). 



mated the dynamic autocorrelation function 



^ / ,N ^ (cos0i(t) COS 0^(0)) - (cos 0.^(0))^ 

(COS2 0.(O))-(COS0,(O))2 ^ ' 

for different 0j. This measurement is statistically very difficult at low temperatures. 
However, the sixteen most central angles (pi in the sequence, all belonging to the 
middle helix, were found to be effectively frozen at kT = 0.6, and the time scale 
for the small fluctuations of these angles about their mean values was possible to 
estimate. In Fig. ^, we show the average Ci{t) for these sixteen angles, denoted by 
C{t), against Monte Carlo time t, for the b = 0, b = fomax and pivot updates. One 
time unit corresponds to one elementary move, accepted or rejected, at a random 
position along the chain. We see that C{t) decays most rapidly for the b = fe^ax 
update. So, the larger step size of this update does make the exploration of these 
degrees of freedom more efficient. 

Let us finally comment on our choice to work with n = 8 angles. This number can 
be easily altered and some calculations were done with n = 6 and n = 7, too. For 
n = 6, the performance was worse, which is not unexpected because there are no soft 
modes available; there are not more variables than constraints. The results obtained 
for n = 7 were, by contrast, comparable to or slightly better than the n = 8 results. 
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5 Discussion 



Straightforward Monte Carlo updates of torsional degrees of freedom tend to cause 
large changes in the global structure of the chains unless the step size is made very 
small, which is a problem in simulations of dense polymer systems. The strictly 
local concerted-rotation approach provides a solution to this problem but is rather 
complicated to implement. In this paper, we have discussed a method that may be less 
powerful but is much easier to implement, which suppresses rather than eliminates 
nonlocal deformations. 

The method is flexible and not much harder to implement than simple unbiased 
updates. However, compared to such updates, it has two distinct advantages: the 
step size can be increased and the update becomes more local, as shown by our 
simulations of the three-helix-bundle protein in its folded phase. 

Making the update more local is important in order to be able to increase the step 
size and thereby improve the efficiency. At the same time, it makes the dynamics 
more realistic; the proposed method is, in contrast to the other methods mentioned, 
tailored to avoid drastic deformations both locally and globally. Therefore, although 
this paper was focused on thermodynamic simulations, it should be noted that this 
method may be useful for kinetic studies, too. 
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